AITopics | likelihood ratio test

Collaborating Authors

likelihood ratio test

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Model Class Selection

Cecil, Ryan, Mentch, Lucas

arXiv.org Machine LearningNov-17-2025

Classical model selection seeks to find a single model within a particular class that optimizes some pre-specified criteria, such as maximizing a likelihood or minimizing a risk. More recently, there has been an increased interest in model set selection (MSS), where the aim is to identify a (confidence) set of near-optimal models. Here, we generalize the MSS framework further by introducing the idea of model class selection (MCS). In MCS, multiple model collections are evaluated, and all collections that contain at least one optimal model are sought for identification. Under mild conditions, data splitting based approaches are shown to provide general solutions for MCS. As a direct consequence, for particular datasets we are able to investigate formally whether classes of simpler and more interpretable statistical models are able to perform on par with more complex black-box machine learning models. A variety of simulated and real-data experiments are provided.

artificial intelligence, correspond, machine learning, (19 more...)

arXiv.org Machine Learning

2511.11355

Country: North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

On robust hypothesis testing with respect to Hellinger distance

Modak, Eeshan

arXiv.org Machine LearningNov-14-2025

We study the hypothesis testing problem where the observed samples need not come from either of the specified hypotheses (distributions). In such a situation, we would like our test to be robust to this misspecification and output the distribution closer in Hellinger distance. If the underlying distribution is close to being equidistant from the hypotheses, then this would not be possible. Our main result is quantifying how close the underlying distribution has to be to either of the hypotheses. We also study the composite testing problem, where each hypothesis is a Hellinger ball around a fixed distribution. A generalized likelihood ratio test is known to work for this problem. We give an alternate test for the same.

artificial intelligence, hellinger distance, scientific discovery, (16 more...)

arXiv.org Machine Learning

2510.1675

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > India > Maharashtra > Mumbai (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Scientific Discovery (0.62)

Add feedback

Sandbagging in a Simple Survival Bandit Problem

Dyer, Joel, Ornia, Daniel Jarne, Bishop, Nicholas, Calinescu, Anisoara, Wooldridge, Michael

arXiv.org Machine LearningOct-1-2025

Evaluating the safety of frontier AI systems is an increasingly important concern, helping to measure the capabilities of such models and identify risks before deployment. However, it has been recognised that if AI agents are aware that they are being evaluated, such agents may deliberately hide dangerous capabilities or intentionally demonstrate suboptimal performance in safety-related tasks in order to be released and to avoid being deactivated or retrained. Such strategic deception - often known as "sandbagging" - threatens to undermine the integrity of safety evaluations. For this reason, it is of value to identify methods that enable us to distinguish behavioural patterns that demonstrate a true lack of capability from behavioural patterns that are consistent with sandbagging. In this paper, we develop a simple model of strategic deception in sequential decision-making tasks, inspired by the recently developed survival bandit framework. We demonstrate theoretically that this problem induces sandbagging behaviour in optimal rational agents, and construct a statistical test to distinguish between sandbagging and incompetence from sequences of test scores. In simulation experiments, we investigate the reliability of this test in allowing us to distinguish between such behaviours in bandit models. This work aims to establish a potential avenue for developing robust statistical procedures for use in the science of frontier model evaluations.

agent, evaluation period, surv, (15 more...)

arXiv.org Machine Learning

2509.26239

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

As Good as a Coin Toss: Human Detection of AI-Generated Content

Communications of the ACMSep-22-2025, 18:17:15 GMT

Membership in ACM includes a subscription to Communications of the ACM (CACM), the computing industry's most trusted source for staying connected to the world of advanced computing. With only a 50-50 chance of detecting synthetic media online, users are more vulnerable than ever to being duped. Advances in generative AI technology have made it easier than ever for anyone to manufacture increasingly realistic synthetic media (colloquially known as deepfakes) at faster speeds, larger scales, and with more customization than ever. This in turn has led to synthetic media increasingly being used for harmful purposes, including disinformation campaigns, nonconsensual pornography, financial fraud, child sexual abuse and exploitation, and espionage. As of today, the principal defense to combat deceptive synthetic media depends in large part on the human observer's perceptual detection capabilities--their ability to visually or auditorily identify AI-generated content when they encounter it. Yet the growing realism of synthetic media impedes this ability, heightening people's vulnerability to weaponized synthetic content. Moreover, people overestimate how capable they are at identifying synthetic media, further exacerbating the problem. As synthetic media continues to advance in sophistication, so too does the threat posed by its growing weaponization, from financial fraud to the production of nonconsensual intimate materials of adults and children.

detection performance, stimuli, synthetic media, (14 more...)

Communications of the ACM

Country:

North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area (1.00)
Media > News (0.89)
Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.55)

Add feedback

Nonlinear Causal Discovery through a Sequential Edge Orientation Approach

Huang, Stella, Zhou, Qing

arXiv.org Machine LearningJun-9-2025

Recent advances have established the identifiability of a directed acyclic graph (DAG) under additive noise models (ANMs), spurring the development of various causal discovery methods. However, most existing methods make restrictive model assumptions, rely heavily on general independence tests, or require substantial computational time. To address these limitations, we propose a sequential procedure to orient undirected edges in a completed partial DAG (CPDAG), representing an equivalence class of DAGs, by leveraging the pairwise additive noise model (PANM) to identify their causal directions. We prove that this procedure can recover the true causal DAG assuming a restricted ANM. Building on this result, we develop a novel constraint-based algorithm for learning causal DAGs under nonlinear ANMs. Given an estimated CPDAG, we develop a ranking procedure that sorts undirected edges by their adherence to the PANM, which defines an evaluation order of the edges. To determine the edge direction, we devise a statistical test that compares the log-likelihood values, evaluated with respect to the competing directions, of a sub-graph comprising just the candidate nodes and their identified parents in the partial DAG. We further establish the structural learning consistency of our algorithm in the large-sample limit. Extensive experiments on synthetic and real-world datasets demonstrate that our method is computationally efficient, robust to model misspecification, and consistently outperforms many existing nonlinear DAG learning methods.

artificial intelligence, machine learning, undirected edge, (16 more...)

arXiv.org Machine Learning

2506.0559

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Virginia > Arlington County > Arlington (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
(2 more...)

Add feedback

Network Causal Effect Estimation In Graphical Models Of Contagion And Latent Confounding

Wu, Yufeng, Bhattacharya, Rohit

arXiv.org Machine LearningNov-2-2024

A key question in many network studies is whether the observed correlations between units are primarily due to contagion or latent confounding. Here, we study this question using a segregated graph (Shpitser, 2015) representation of these mechanisms, and examine how uncertainty about the true underlying mechanism impacts downstream computation of network causal effects, particularly under full interference -- settings where we only have a single realization of a network and each unit may depend on any other unit in the network. Under certain assumptions about asymptotic growth of the network, we derive likelihood ratio tests that can be used to identify whether different sets of variables -- confounders, treatments, and outcomes -- across units exhibit dependence due to contagion or latent confounding. We then propose network causal effect estimation strategies that provide unbiased and consistent estimates if the dependence mechanisms are either known or correctly inferred using our proposed tests. Together, the proposed methods allow network effect estimation in a wider range of full interference scenarios that have not been considered in prior work. We evaluate the effectiveness of our methods with synthetic data and the validity of our assumptions using real-world networks.

artificial intelligence, bidirected edge, machine learning, (17 more...)

arXiv.org Machine Learning

2411.01371

Country:

North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Add feedback

A Likelihood Ratio Test of Genetic Relationship among Languages

Akavarapu, V. S. D. S. Mahesh, Bhattacharya, Arnab

arXiv.org Artificial IntelligenceMar-30-2024

Lexical resemblances among a group of languages indicate that the languages could be genetically related, i.e., they could have descended from a common ancestral language. However, such resemblances can arise by chance and, hence, need not always imply an underlying genetic relationship. Many tests of significance based on permutation of wordlists and word similarity measures appeared in the past to determine the statistical significance of such relationships. We demonstrate that although existing tests may work well for bilateral comparisons, i.e., on pairs of languages, they are either infeasible by design or are prone to yield false positives when applied to groups of languages or language families. To this end, inspired by molecular phylogenetics, we propose a likelihood ratio test to determine if given languages are related based on the proportion of invariant character sites in the aligned wordlists applied during tree inference. Further, we evaluate some language families and show that the proposed test solves the problem of false positives. Finally, we demonstrate that the test supports the existence of macro language families such as Nostratic and Macro-Mayan.

hypothesis, linguistics, significance, (15 more...)

arXiv.org Artificial Intelligence

2404.00284

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Kansas (0.04)
(9 more...)

Genre: Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.55)

Add feedback

Heavy-tailed Distances for Gradient Based Image Descriptors

Neural Information Processing SystemsMar-15-2024, 02:43:52 GMT

Many applications in computer vision measure the similarity between images or image patches based on some statistics such as oriented gradients. These are often modeled implicitly or explicitly with a Gaussian noise assumption, leading to the use of the Euclidean distance when comparing image descriptors. In this paper, we show that the statistics of gradient based image descriptors often follow a heavy-tailed distribution, which undermines any principled motivation for the use of Euclidean distances. We advocate for the use of a distance measure based on the likelihood ratio test with appropriate probabilistic models that fit the empirical data distribution. We instantiate this similarity measure with the Gammacompound-Laplace distribution, and show significant improvement over existing distance measures in the application of SIFT feature matching, at relatively low computational cost.

descriptor, distance measure, statistics, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Add feedback

Near-optimal Anomaly Detection in Graphs using Lovász Extended Scan Statistic

Neural Information Processing SystemsMar-13-2024, 18:07:30 GMT

The detection of anomalous activity in graphs is a statistical problem that arises in many applications, such as network surveillance, disease outbreak detection, and activity monitoring in social networks. Beyond its wide applicability, graph structured anomaly detection serves as a case study in the difficulty of balancing computational complexity with statistical power. In this work, we develop from first principles the generalized likelihood ratio test for determining if there is a well connected region of activation over the vertices in the graph in Gaussian noise. Because this test is computationally infeasible, we provide a relaxation, called the Lovász extended scan statistic (LESS) that uses submodularity to approximate the intractable generalized likelihood ratio. We demonstrate a connection between LESS and maximum a-posteriori inference in Markov random fields, which provides us with a poly-time algorithm for LESS. Using electrical network theory, we are able to control type 1 error for LESS and prove conditions under which LESS is risk consistent. Finally, we consider specific graph models, the torus, k-nearest neighbor graphs, and ǫ-random graphs. We show that on these graphs our results provide near-optimal performance by matching our results to known lower bounds.

graph, statistic, statistics, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Industry:

Health & Medicine > Epidemiology (0.54)
Energy > Power Industry (0.34)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Add feedback

Benchmarking optimality of time series classification methods in distinguishing diffusions

Zhang, Zehong, Lu, Fei, Fei, Esther Xu, Lyons, Terry, Kevrekidis, Yannis, Woolf, Tom

arXiv.org Artificial IntelligenceApr-11-2023

Statistical optimality benchmarking is crucial for analyzing and designing time series classification (TSC) algorithms. This study proposes to benchmark the optimality of TSC algorithms in distinguishing diffusion processes by the likelihood ratio test (LRT). The LRT is an optimal classifier by the Neyman-Pearson lemma. The LRT benchmarks are computationally efficient because the LRT does not need training, and the diffusion processes can be efficiently simulated and are flexible to reflect the specific features of real-world applications. We demonstrate the benchmarking with three widely-used TSC algorithms: random forest, ResNet, and ROCKET. These algorithms can achieve the LRT optimality for univariate time series and multivariate Gaussian processes. However, these model-agnostic algorithms are suboptimal in classifying high-dimensional nonlinear multivariate time series. Additionally, the LRT benchmark provides tools to analyze the dependence of classification accuracy on the time length, dimension, temporal sampling frequency, and randomness of the time series.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2301.13112

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback